AITopics | cloud llm

Collaborating Authors

cloud llm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CORE: Reducing UI Exposure in Mobile Agents via Collaboration Between Cloud and Local LLMs

Fan, Gucongcong, Niu, Chaoyue, Lyu, Chengfei, Wu, Fan, Chen, Guihai

arXiv.org Artificial IntelligenceOct-20-2025

Mobile agents rely on Large Language Models (LLMs) to plan and execute tasks on smartphone user interfaces (UIs). While cloud-based LLMs achieve high task accuracy, they require uploading the full UI state at every step, exposing unnecessary and often irrelevant information. In contrast, local LLMs avoid UI uploads but suffer from limited capacity, resulting in lower task success rates. We propose $\textbf{CORE}$, a $\textbf{CO}$llaborative framework that combines the strengths of cloud and local LLMs to $\textbf{R}$educe UI $\textbf{E}$xposure, while maintaining task accuracy for mobile agents. CORE comprises three key components: (1) $\textbf{Layout-aware block partitioning}$, which groups semantically related UI elements based on the XML screen hierarchy; (2) $\textbf{Co-planning}$, where local and cloud LLMs collaboratively identify the current sub-task; and (3) $\textbf{Co-decision-making}$, where the local LLM ranks relevant UI blocks, and the cloud LLM selects specific UI elements within the top-ranked block. CORE further introduces a multi-round accumulation mechanism to mitigate local misjudgment or limited context. Experiments across diverse mobile apps and tasks show that CORE reduces UI exposure by up to 55.6% while maintaining task success rates slightly below cloud-only agents, effectively mitigating unnecessary privacy exposure to the cloud. The code is available at https://github.com/Entropy-Fighter/CORE.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.15455

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Africa > South Africa (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Services (0.51)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)

Add feedback

CON-QA: Privacy-Preserving QA using cloud LLMs in Contract Domain

Singh, Ajeet Kumar, Surya, Rajsabi, Tripathi, Anurag, Choudhury, Santanu, Bisane, Sudhir

arXiv.org Artificial IntelligenceSep-25-2025

As enterprises increasingly integrate cloud-based large language models (LLMs) such as ChatGPT and Gemini into their legal document workflows, protecting sensitive contractual information - including Personally Identifiable Information (PII) and commercially sensitive clauses - has emerged as a critical challenge. In this work, we propose CON-QA, a hybrid privacy-preserving framework designed specifically for secure question answering over enterprise contracts, effectively combining local and cloud-hosted LLMs. The CON-QA framework operates through three stages: (i) semantic query decomposition and query-aware document chunk retrieval using a locally deployed LLM analysis, (ii) anonymization of detected sensitive entities via a structured one-to-many mapping scheme, ensuring semantic coherence while preventing cross-session entity inference attacks, and (iii) anonymized response generation by a cloud-based LLM, with accurate reconstruction of the original answer locally using a session-consistent many-to-one reverse mapping. To rigorously evaluate CON-QA, we introduce CUAD-QA, a corpus of 85k question-answer pairs generated over 510 real-world CUAD contract documents, encompassing simple, complex, and summarization-style queries. Empirical evaluations, complemented by detailed human assessments, confirm that CON-QA effectively maintains both privacy and utility, preserves answer quality, maintains fidelity to legal clause semantics, and significantly mitigates privacy risks, demonstrating its practical suitability for secure, enterprise-level contract documents.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2509.19925

Country:

North America (0.04)
Asia > India > NCT > New Delhi (0.04)

Genre: Research Report (0.64)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Principle-Guided Verilog Optimization: IP-Safe Knowledge Transfer via Local-Cloud Collaboration

Wang, Jing, Li, Zheng, Li, Lei, He, Fan, Lin, Liyu, Lai, Yao, Li, Yan, Zeng, Xiaoyang, Guo, Yufeng

arXiv.org Artificial IntelligenceAug-11-2025

Recent years have witnessed growing interest in adopting large language models (LLMs) for Register Transfer Level (RTL) code optimization. While powerful cloud-based LLMs offer superior optimization capabilities, they pose unacceptable intellectual property (IP) leakage risks when processing proprietary hardware designs. In this paper, we propose a new scenario where Verilog code must be optimized for specific attributes without leaking sensitive IP information. We introduce the first IP-preserving edge-cloud collaborative framework that leverages the benefits of both paradigms. Our approach employs local small LLMs (e.g., Qwen-2.5-Coder-7B) to perform secure comparative analysis between paired high-quality target designs and novice draft codes, yielding general design principles that summarize key insights for improvements. These principles are then used to query stronger cloud LLMs (e.g., Deepseek-V3) for targeted code improvement, ensuring that only abstracted and IP-safe guidance reaches external services. Our experimental results demonstrate that the framework achieves significantly higher optimization success rates compared to baseline methods. For example, combining Qwen-2.5-Coder-7B and Deepseek-V3 achieves a 66.67\% optimization success rate for power utilization, outperforming Deepseek-V3 alone (49.81\%) and even commercial models like GPT-4o (55.81\%). Further investigation of local and cloud LLM combinations reveals that different model pairings exhibit varying strengths for specific optimization objectives, with interesting trends emerging when varying the number of comparative code pairs. Our work establishes a new paradigm for secure hardware design optimization that balances performance gains with IP protection.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2508.05675

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Semiconductors & Electronics (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MedOrchestra: A Hybrid Cloud-Local LLM Approach for Clinical Data Interpretation

Lee, Sihyeon, Song, Hyunjoo, Lee, Jong-chan, Lee, Yoon Jin, Lee, Boram, Lim, Hee-Eon, Kim, Dongyeong, Seo, Jinwook, Kim, Bohyoung

arXiv.org Artificial IntelligenceJun-2-2025

Deploying large language models (LLMs) in clinical settings faces critical trade-offs: cloud LLMs, with their extensive parameters and superior performance, pose risks to sensitive clinical data privacy, while local LLMs preserve privacy but often fail at complex clinical interpretation tasks. We propose MedOrchestra, a hybrid framework where a cloud LLM decomposes complex clinical tasks into manageable subtasks and prompt generation, while a local LLM executes these subtasks in a privacy-preserving manner. Without accessing clinical data, the cloud LLM generates and validates subtask prompts using clinical guidelines and synthetic test cases. The local LLM executes subtasks locally and synthesizes outputs generated by the cloud LLM. We evaluate MedOrchestra on pancreatic cancer staging using 100 radiology reports under NCCN guidelines. On free-text reports, MedOrchestra achieves 70.21% accuracy, outperforming local model baselines (without guideline: 48.94%, with guideline: 56.59%) and board-certified clinicians (gastroenterologists: 59.57%, surgeons: 65.96%, radiologists: 55.32%). On structured reports, MedOrchestra reaches 85.42% accuracy, showing clear superiority across all settings.

large language model, machine learning, medorchestra, (20 more...)

arXiv.org Artificial Intelligence

2505.23806

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > Spain (0.04)
Europe > Italy > Tuscany > Florence (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Hybrid Inference for LLMs: Reward-Based Token Modelling with Selective Cloud Assistance

MS, Adarsh, VG, Jithin, PS, Ditto

arXiv.org Artificial IntelligenceSep-15-2024

Large language models (LLMs) are known for their exceptional performance across a range of natural language processing tasks, but their deployment comes at a high computational and financial cost. On the other hand, smaller language models (SLMs), which can be deployed on lower-cost edge devices, struggle to match the performance of their larger counterparts. This paper presents a novel hybrid inference approach that leverages the strengths of both model types while minimizing reliance on costly cloud-based LLMs. Unlike existing methods that route entire queries to either an SLM or a cloud LLM, our approach introduces a reward-based mechanism to dynamically determine the involvement of the cloud LLM during token generation. Specifically, each token predicted by the SLM is evaluated against a reward score, and only when this score falls below a certain threshold is the cloud LLM consulted for assistance in the next token prediction. This method not only reduces the traffic to the cloud LLM, thereby lowering costs, but also allows for flexible control over response quality depending on the reward score threshold. Experimental results demonstrate that our approach significantly reduces cloud LLM usage with minimal impact on overall response quality, offering a cost-effective solution for deploying high-performance language models.

cloud llm, reward model, threshold, (15 more...)

arXiv.org Artificial Intelligence

2409.13757

Genre: Research Report > New Finding (0.34)

Industry: Information Technology > Services (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Enabling On-Device LLMs Personalization with Smartphone Sensing

Zhang, Shiquan, Ma, Ying, Fang, Le, Jia, Hong, D'Alfonso, Simon, Kostakos, Vassilis

arXiv.org Artificial IntelligenceJul-5-2024

This demo presents a novel end-to-end framework that combines on-device large language models (LLMs) with smartphone sensing technologies to achieve context-aware and personalized services. The framework addresses critical limitations of current personalization solutions via cloud-based LLMs, such as privacy concerns, latency and cost, and limited personal sensor data. To achieve this, we innovatively proposed deploying LLMs on smartphones with multimodal sensor data and customized prompt engineering, ensuring privacy and enhancing personalization performance through context-aware sensing. A case study involving a university student demonstrated the proposed framework's capability to provide tailored recommendations. In addition, we show that the proposed framework achieves the best trade-off in privacy, performance, latency, cost, battery and energy consumption between on-device and cloud LLMs. Future work aims to integrate more diverse sensor data and conduct large-scale user studies to further refine the personalization. We envision the proposed framework could significantly improve user experiences in various domains such as healthcare, productivity, and entertainment by providing secure, context-aware, and efficient interactions directly on users' devices.

arxiv preprint arxiv, llm, smartphone, (11 more...)

arXiv.org Artificial Intelligence

2407.04418

Country:

Oceania > Australia > Victoria > Melbourne (0.07)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (1.00)
Information Technology > Services (0.67)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)

Add feedback

Privacy Preserving Prompt Engineering: A Survey

Edemacu, Kennedy, Wu, Xintao

arXiv.org Artificial IntelligenceApr-10-2024

Pre-trained language models (PLMs) have demonstrated significant proficiency in solving a wide range of general natural language processing (NLP) tasks. Researchers have observed a direct correlation between the performance of these models and their sizes. As a result, the sizes of these models have notably expanded in recent years, persuading researchers to adopt the term large language models (LLMs) to characterize the larger-sized PLMs. The size expansion comes with a distinct capability called in-context learning (ICL), which represents a special form of prompting and allows the models to be utilized through the presentation of demonstration examples without modifications to the model parameters. Although interesting, privacy concerns have become a major obstacle in its widespread usage. Multiple studies have examined the privacy risks linked to ICL and prompting in general, and have devised techniques to alleviate these risks. Thus, there is a necessity to organize these mitigation techniques for the benefit of the community. This survey provides a systematic overview of the privacy protection methods employed during ICL and prompting in general. We review, analyze, and compare different methods under this paradigm. Furthermore, we provide a summary of the resources accessible for the development of these frameworks. Finally, we discuss the limitations of these frameworks and offer a detailed examination of the promising areas that necessitate further exploration.

dataset, language model, llm, (15 more...)

arXiv.org Artificial Intelligence

2404.06001

Country:

North America > United States > California (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arkansas (0.04)
(4 more...)

Genre: Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback